{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "rFiCyWQ-NC5D"
   },
   "source": [
    "# Ungraded Lab: Using Convolutional Neural Networks\n",
    "\n",
    "In this lab, you will look at another way of building your text classification model and this will be with a convolution layer. As you learned in Course 2 of this specialization, convolutions extract features by applying filters to the input. Let's see how you can use that for text data in the next sections."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "812DOIF9qUtj"
   },
   "outputs": [],
   "source": [
    "import tensorflow as tf\n",
    "import tensorflow_datasets as tfds\n",
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "import keras_nlp"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "djvGxIRDHT5e"
   },
   "source": [
    "## Download and prepare the dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "Y20Lud2ZMBhW"
   },
   "outputs": [],
   "source": [
    "# The dataset is already downloaded for you. For downloading you can use the code below.\n",
    "imdb = tfds.load(\"imdb_reviews\", as_supervised=True, data_dir=\"../data/\", download=False)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "1KwENtXmqk0v"
   },
   "outputs": [],
   "source": [
    "# Extract the train reviews and labels\n",
    "train_reviews = imdb['train'].map(lambda review, label: review)\n",
    "train_labels = imdb['train'].map(lambda review, label: label)\n",
    "\n",
    "# Extract the test reviews and labels\n",
    "test_reviews = imdb['test'].map(lambda review, label: review)\n",
    "test_labels = imdb['test'].map(lambda review, label: label)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "AW-4Vo4TMUHb"
   },
   "outputs": [],
   "source": [
    "# Download the subword vocabulary\n",
    "!wget https://storage.googleapis.com/tensorflow-1-public/course3/imdb_vocab_subwords.txt"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "HQFqE7fnqpYu"
   },
   "outputs": [],
   "source": [
    "# Initialize the subword tokenizer\n",
    "subword_tokenizer = keras_nlp.tokenizers.WordPieceTokenizer(\n",
    "    vocabulary='./imdb_vocab_subwords.txt'\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "GRmW9GX2qyfv"
   },
   "outputs": [],
   "source": [
    "# Data pipeline and padding parameters\n",
    "SHUFFLE_BUFFER_SIZE = 10000\n",
    "PREFETCH_BUFFER_SIZE = tf.data.AUTOTUNE\n",
    "BATCH_SIZE = 256\n",
    "PADDING_TYPE = 'pre'\n",
    "TRUNC_TYPE = 'post'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "zYrAfevOq0XK"
   },
   "outputs": [],
   "source": [
    "def padding_func(sequences):\n",
    "  '''Generates padded sequences from a tf.data.Dataset'''\n",
    "\n",
    "  # Put all elements in a single ragged batch\n",
    "  sequences = sequences.ragged_batch(batch_size=sequences.cardinality())\n",
    "\n",
    "  # Output a tensor from the single batch\n",
    "  sequences = sequences.get_single_element()\n",
    "\n",
    "  # Pad the sequences\n",
    "  padded_sequences = tf.keras.utils.pad_sequences(sequences.numpy(), truncating=TRUNC_TYPE, padding=PADDING_TYPE)\n",
    "\n",
    "  # Convert back to a tf.data.Dataset\n",
    "  padded_sequences = tf.data.Dataset.from_tensor_slices(padded_sequences)\n",
    "\n",
    "  return padded_sequences"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "Y92GGi4hq2Bm"
   },
   "outputs": [],
   "source": [
    "# Generate integer sequences using the subword tokenizer\n",
    "train_sequences_subword = train_reviews.map(lambda review: subword_tokenizer.tokenize(review)).apply(padding_func)\n",
    "test_sequences_subword = test_reviews.map(lambda review: subword_tokenizer.tokenize(review)).apply(padding_func)\n",
    "\n",
    "# Combine the integer sequence and labels\n",
    "train_dataset_vectorized = tf.data.Dataset.zip(train_sequences_subword,train_labels)\n",
    "test_dataset_vectorized = tf.data.Dataset.zip(test_sequences_subword,test_labels)\n",
    "\n",
    "# Optimize the datasets for training\n",
    "train_dataset_final = (train_dataset_vectorized\n",
    "                       .shuffle(SHUFFLE_BUFFER_SIZE)\n",
    "                       .cache()\n",
    "                       .prefetch(buffer_size=PREFETCH_BUFFER_SIZE)\n",
    "                       .batch(BATCH_SIZE)\n",
    "                       )\n",
    "\n",
    "test_dataset_final = (test_dataset_vectorized\n",
    "                      .cache()\n",
    "                      .prefetch(buffer_size=PREFETCH_BUFFER_SIZE)\n",
    "                      .batch(BATCH_SIZE)\n",
    "                      )"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "nfatNr6-IAcd"
   },
   "source": [
    "## Build the Model\n",
    "\n",
    "In Course 2, you were using 2D convolution layers because you were applying it on images. For temporal data such as text sequences, you will use [Conv1D](https://www.tensorflow.org/api_docs/python/tf/keras/layers/Conv1D) instead so the convolution will happen over a single dimension. You will also append a pooling layer to reduce the output of the convolution layer. For this lab, you will use [GlobalMaxPooling1D](https://www.tensorflow.org/api_docs/python/tf/keras/layers/GlobalMaxPool1D) to get the max value across the time dimension. You can also use average pooling and you will do that in the next labs. See how these layers behave as standalone layers in the cell below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "Ay87qbqwIJaV"
   },
   "outputs": [],
   "source": [
    "# Parameters\n",
    "BATCH_SIZE = 1\n",
    "TIMESTEPS = 20\n",
    "FEATURES = 20\n",
    "FILTERS = 128\n",
    "KERNEL_SIZE = 5\n",
    "\n",
    "print(f'batch_size: {BATCH_SIZE}')\n",
    "print(f'timesteps (sequence length): {TIMESTEPS}')\n",
    "print(f'features (embedding size): {FEATURES}')\n",
    "print(f'filters: {FILTERS}')\n",
    "print(f'kernel_size: {KERNEL_SIZE}')\n",
    "\n",
    "# Define array input with random values\n",
    "random_input = np.random.rand(BATCH_SIZE,TIMESTEPS,FEATURES)\n",
    "print(f'shape of input array: {random_input.shape}')\n",
    "\n",
    "# Pass array to convolution layer and inspect output shape\n",
    "conv1d = tf.keras.layers.Conv1D(filters=FILTERS, kernel_size=KERNEL_SIZE, activation='relu')\n",
    "result = conv1d(random_input)\n",
    "print(f'shape of conv1d output: {result.shape}')\n",
    "\n",
    "# Pass array to max pooling layer and inspect output shape\n",
    "gmp = tf.keras.layers.GlobalMaxPooling1D()\n",
    "result = gmp(result)\n",
    "print(f'shape of global max pooling output: {result.shape}')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "lNNYF7tqO7it"
   },
   "source": [
    "You can build the model by simply appending the convolution and pooling layer after the embedding layer as shown below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "jo1jjO3vn0jo"
   },
   "outputs": [],
   "source": [
    "# Hyperparameters\n",
    "EMBEDDING_DIM = 64\n",
    "FILTERS = 128\n",
    "KERNEL_SIZE = 5\n",
    "DENSE_DIM = 64\n",
    "\n",
    "# Build the model\n",
    "model = tf.keras.Sequential([\n",
    "    tf.keras.Input(shape=(None,)),\n",
    "    tf.keras.layers.Embedding(subword_tokenizer.vocabulary_size(), EMBEDDING_DIM),\n",
    "    tf.keras.layers.Conv1D(filters=FILTERS, kernel_size=KERNEL_SIZE, activation='relu'),\n",
    "    tf.keras.layers.GlobalMaxPooling1D(),\n",
    "    tf.keras.layers.Dense(DENSE_DIM, activation='relu'),\n",
    "    tf.keras.layers.Dense(1, activation='sigmoid')\n",
    "])\n",
    "\n",
    "# Print the model summary\n",
    "model.summary()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "Uip7QOVzMoMq"
   },
   "outputs": [],
   "source": [
    "# Set the training parameters\n",
    "model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "iLJu8HEvPG0L"
   },
   "source": [
    "## Train the model\n",
    "\n",
    "Training will take around 30 seconds per epoch and you will notice that it reaches higher accuracies than the previous models you've built."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "7mlgzaRDMtF6"
   },
   "outputs": [],
   "source": [
    "NUM_EPOCHS = 10\n",
    "\n",
    "# Train the model\n",
    "history = model.fit(train_dataset_final, epochs=NUM_EPOCHS, validation_data=test_dataset_final)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "Mp1Z7P9pYRSK"
   },
   "outputs": [],
   "source": [
    "def plot_loss_acc(history):\n",
    "  '''Plots the training and validation loss and accuracy from a history object'''\n",
    "  acc = history.history['accuracy']\n",
    "  val_acc = history.history['val_accuracy']\n",
    "  loss = history.history['loss']\n",
    "  val_loss = history.history['val_loss']\n",
    "\n",
    "  epochs = range(len(acc))\n",
    "\n",
    "  fig, ax = plt.subplots(1,2, figsize=(12, 6))\n",
    "  ax[0].plot(epochs, acc, 'bo', label='Training accuracy')\n",
    "  ax[0].plot(epochs, val_acc, 'b', label='Validation accuracy')\n",
    "  ax[0].set_title('Training and validation accuracy')\n",
    "  ax[0].set_xlabel('epochs')\n",
    "  ax[0].set_ylabel('accuracy')\n",
    "  ax[0].legend()\n",
    "\n",
    "  ax[1].plot(epochs, loss, 'bo', label='Training Loss')\n",
    "  ax[1].plot(epochs, val_loss, 'b', label='Validation Loss')\n",
    "  ax[1].set_title('Training and validation loss')\n",
    "  ax[1].set_xlabel('epochs')\n",
    "  ax[1].set_ylabel('loss')\n",
    "  ax[1].legend()\n",
    "\n",
    "  plt.show()\n",
    "\n",
    "plot_loss_acc(history)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "0rD7ZS84PlUp"
   },
   "source": [
    "## Wrap Up\n",
    "\n",
    "In this lab, you explored another model architecture you can use for text classification. In the next lessons, you will revisit full word encoding of the IMDB reviews and compare which model works best when the data is prepared that way.\n",
    "\n",
    "As before, run the cell below to free up resources."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Shutdown the kernel to free up resources. \n",
    "# Note: You can expect a pop-up when you run this cell. You can safely ignore that and just press `Ok`.\n",
    "\n",
    "from IPython import get_ipython\n",
    "\n",
    "k = get_ipython().kernel\n",
    "\n",
    "k.do_shutdown(restart=False)"
   ]
  }
 ],
 "metadata": {
  "accelerator": "GPU",
  "colab": {
   "name": "C3_W3_Lab_3_Conv1D.ipynb",
   "private_outputs": true,
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.0rc1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
